Breaking Bad: A Brief Overview

Breaking Bad is an American crime drama created by Vince Gilligan. It aired on AMC from January 20, 2008 to September 29, 2013. The series follows Walter White, a high-school chemistry teacher turned meth manufacturer, and his former student Jesse Pinkman.


1. Data Import & Cleaning

I begin by scraping the episode tables from Wikipedia and performing basic cleaning so that all columns align properly.

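# Packages used throughout this report
library(rvest)
library(dplyr)
library(purrr)
library(stringr)
library(ggplot2)
library(plotly)

# 1) Scrape the episode tables from Wikipedia
url    <- "https://en.wikipedia.org/wiki/List_of_Breaking_Bad_episodes"
page   <- read_html(url)
tables <- page %>% html_nodes("table.wikiepisodetable") %>%
          html_table(header = TRUE, fill = TRUE)  # header is a logical flag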

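# Keep only the tables that have a "No.overall" column (the per-season episode tables)
tables_ep <- keep(tables, ~ "No.overall" %in% names(.x))

# Make column names syntactic and coerce every column to character so the tables bind cleanly
tables_clean <- map(tables_ep, function(tbl) {
  names(tbl) <- make.names(names(tbl))
  tbl[]      <- lapply(tbl, as.character)
  tbl
})

# Stack the tables; .id records each table's position, used here as the season number
df_raw <- bind_rows(tables_clean, .id = "Season") %>%
  mutate(Season = as.integer(Season))

# Name of the first column that mentions viewers
vcol <- grep("viewers", names(df_raw), ignore.case = TRUE, value = TRUE)[1]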

# Strip out citations and parse numeric
df_raw <- df_raw %>%
  mutate(
    Viewers = as.numeric(str_remove_all(.data[[vcol]], "\\[.*?\\]"))
  )


# Keep the relevant columns, sort, and restrict to seasons 1–5
df <- df_raw %>%
  transmute(
    Season         = Season,
    EpisodeOverall = as.integer(str_extract(No.overall, "\\d+")),
    Title          = Title,
    Viewers        = Viewers
  ) %>%
  arrange(Season, EpisodeOverall) %>%
  filter(Season <= 5)
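
To illustrate the citation-stripping step, here is a minimal sketch run on a few made-up cell values. The strings in raw_cells are hypothetical and only mimic how a viewer figure followed by footnote markers might look. The regular expression is the one used in the cleaning code above.

```r
library(stringr)

# Hypothetical raw cells: a viewer figure followed by one or more footnote markers
raw_cells <- c("1.41[5]", "10.28[66][67]", "N/A")

# Drop every bracketed citation, using the same pattern as in the cleaning code
no_citations <- str_remove_all(raw_cells, "\\[.*?\\]")

# Anything that is still non-numeric becomes NA; suppressWarnings() hides the coercion warning
viewers <- suppressWarnings(as.numeric(no_citations))
print(viewers)
```

The non-greedy .*? matters here: it removes each bracketed marker separately instead of swallowing everything between the first "[" and the last "]". Cells that are not purely numeric after stripping, such as the hypothetical "N/A", end up as NA.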

Quick look at the cleaned data:

knitr::kable(head(df), caption="*Table: First 6 episodes of the cleaned dataset*")
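Table: First 6 episodes of the cleaned dataset

| Season | EpisodeOverall | Title                          | Viewers |
|-------:|---------------:|:-------------------------------|--------:|
|      1 |              1 | “Pilot”                        |    1.41 |
|      1 |              2 | “Cat’s in the Bag…”            |    1.49 |
|      1 |              3 | “…And the Bag’s in the River”  |    1.08 |
|      1 |              4 | “Cancer Man”                   |    1.09 |
|      1 |              5 | “Gray Matter”                  |    0.97 |
|      1 |              6 | “Crazy Handful of Nothin’”     |    1.07 |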

Data summary (note that two rows have missing EpisodeOverall and Viewers values):

summary(df)
##      Season      EpisodeOverall     Title              Viewers      
##  Min.   :1.000   Min.   : 1.00   Length:64          Min.   : 0.970  
##  1st Qu.:2.000   1st Qu.:16.25   Class :character   1st Qu.: 1.323  
##  Median :3.000   Median :31.50   Mode  :character   Median : 1.650  
##  Mean   :3.344   Mean   :31.50                      Mean   : 2.236  
##  3rd Qu.:5.000   3rd Qu.:46.75                      3rd Qu.: 2.268  
##  Max.   :5.000   Max.   :62.00                      Max.   :10.280  
##                  NA's   :2                          NA's   :2

2. Summary Statistics

Next, I compute season‐level summaries: number of episodes and average viewers.
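
A minimal sketch of how this summary could be computed, assuming the result is stored in a data frame named season_summary with an Avg.Viewers column (the names the later plots rely on). The helper name summarise_seasons and the Episodes column are illustrative assumptions, and the demo uses only the six Season 1 rows shown in the table above rather than the full dataset.

```r
library(dplyr)

# Hypothetical helper: collapse an episode-level table to one row per season
summarise_seasons <- function(episodes) {
  episodes %>%
    filter(!is.na(Viewers)) %>%        # ignore rows without a viewer figure
    group_by(Season) %>%
    summarise(
      Episodes    = n(),               # number of episodes with data
      Avg.Viewers = mean(Viewers),     # average viewers, in millions
      .groups     = "drop"
    )
}

# Demo input: the six Season 1 rows from the table above
df_demo <- data.frame(
  Season         = 1L,
  EpisodeOverall = 1:6,
  Viewers        = c(1.41, 1.49, 1.08, 1.09, 0.97, 1.07)
)

season_summary_demo <- summarise_seasons(df_demo)
print(season_summary_demo)
```

In the report itself the helper would be applied to the full dataset, for example season_summary <- summarise_seasons(df). Rows with a missing viewer figure are dropped before averaging so that a single NA does not turn a whole season's mean into NA.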

3. Episode‐Level Trend

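Plot 1 shows viewership for each episode across all five seasons. Notice how most episodes cluster at the low end (the median is 1.65 million viewers), while the upper tail spreads all the way to a maximum of 10.28 million.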

p1 <- ggplot(df, aes(x=EpisodeOverall, y=Viewers, color=factor(Season))) +
  geom_line() +
  geom_point() +
  labs(
    title = "Viewers per Episode (All Seasons)",
    x     = "Episode (Overall)",
    y     = "Viewers (millions)",
    color = "Season"
  ) +
  theme_minimal(base_size=13)

ggplotly(p1, tooltip=c("x","y","color")) %>%
  layout(hovermode="closest") %>%
  config(displayModeBar=FALSE)

4. Season‐Level Trend

Plot 2 shows the average viewership per season, highlighting an overall upward trajectory.

p2 <- ggplot(season_summary, aes(x=Season, y=Avg.Viewers)) +
  geom_line(linewidth=1.2, linetype="dashed", color="#205237") +  # linewidth replaces size for lines in ggplot2 >= 3.4
  geom_point(size=4, color="#205237") +
  labs(
    title = "Average Viewers per Season",
    x     = "Season",
    y     = "Viewers (M)"
  ) +
  theme_minimal(base_size=14)

ggplotly(p2, tooltip=c("x","y")) %>%
  layout(hovermode="x unified") %>%
  config(displayModeBar=FALSE)

5. Season‐to‐Season Change

Plot 3 illustrates the change in average viewership between consecutive seasons.

season_summary <- season_summary %>%
  mutate(Delta = round(Avg.Viewers - lag(Avg.Viewers), 2))

p3 <- ggplot(filter(season_summary, Season > 1),
             aes(x=Season, y=Delta, text=paste0("Δ=",Delta," M"))) +
  geom_col(fill="#2d4d00", alpha=0.8) +
  labs(
    title = "Change in Average Viewers Between Seasons",
    x     = "Season",
    y     = "Δ Viewers (M)"
  ) +
  theme_minimal(base_size=14)

ggplotly(p3, tooltip="text") %>%
  config(displayModeBar=FALSE)

6. Insights & Conclusion

Breaking Bad shows a clear growth pattern:

From a modest 1.23 million average viewers in Season 1, the audience grew every year, climbing to 1.31 million (Season 2), 1.52 million (Season 3), and 1.87 million (Season 4). The largest leap comes between Seasons 4 and 5: an increase of 2.45 million viewers, to an average of 4.32 million.
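
The season-to-season changes quoted above can be checked with a short sketch. It assumes only the five rounded season averages stated in the text and uses the same lag-based difference as the Delta column in Section 5. The data frame name avg_check is illustrative.

```r
library(dplyr)

# Season averages (millions of viewers) as quoted in the paragraph above
avg_check <- data.frame(
  Season      = 1:5,
  Avg.Viewers = c(1.23, 1.31, 1.52, 1.87, 4.32)
)

# Difference from the previous season; Season 1 has no predecessor, so its Delta is NA
avg_check <- avg_check %>%
  mutate(Delta = round(Avg.Viewers - lag(Avg.Viewers), 2))

print(avg_check)
```

Every Delta after Season 1 is positive, and the Season 5 value matches the 2.45 million increase described in the text.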

This trend reflects how the series gained momentum, drawing in more viewers each season and culminating in a finale that became a cultural phenomenon.

Final Thought:
Breaking Bad’s escalating viewership mirrors its narrative arc: starting small, intensifying mid-run, and exploding at the climax.